Skip to content

Conversation

@granturing
Copy link
Contributor

What is this PR for?

This is to give Windows first-class support for running Zeppelin without the need for Cygwin or other hacks.

What type of PR is it?

Improvement

Todos

Is there a relevant Jira issue?

ZEPPELIN-647

How should this be tested?

Screenshots (if appropriate)

Questions:

  • Does the licenses files need update? No
  • Is there breaking changes for older versions? No
  • Does this needs documentation? Yes

<directory>../conf</directory>
<excludes>
<exclude>interpreter.json</exclude>
<exclude>zeppelin-env.cmd</exclude>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@granturing Fix the indentations, please.

@Leemoonsoo
Copy link
Member

@granturing Appreciate for the patch!
I'll manually test it in this evening with my windows XP and share the result.

@Leemoonsoo
Copy link
Member

It works well on my windows XP with oracle jdk 1.7.0_79. great work!

@ankurmitujjain
Copy link

It works on Windows 7 as well.

@prabhjyotsingh
Copy link
Contributor

I tried it on Windows 7, works well.

Except for these two test failure in cassandra, when I do
SPARK_VER="1.6.0" HADOOP_VER="2.3" PROFILE="-Pspark-1.6 -Phadoop-2.3 -Ppyspark -Pscalding" BUILD_FLAG="package -Pbuild-distr" TEST_FLAG="verify -Pusing-packaged-distr" TEST_PROJECTS=""
mvn $BUILD_FLAG $PROFILE -B

    - should parse multi-line comment *** FAILED ***
    [6.7] parsed: Comment(This is a comment

    line1

    line2

    line3

    ) did not match the given pattern (ParagraphParserTest.scala:113)

    - should parse a block queries with comments *** FAILED ***
    [24.7] parsed: List(Comment(

    This example show how to force a

    timestamp on the query

    ), Comment(Timestamp in the past), Timestamp(10), SimpleStm(CREATE TABLE IF NOT EXISTS spark_demo.ts(key int PRIMARY KEY, value text);), SimpleStm(TRUNCATE spark_demo.ts;), Comment(Force timestamp directly in the first INSERT), SimpleStm(INSERT INTO spark_demo.ts(key,value) VALUES(1,'val1') USING TIMESTAMP 100;), Comment(Selec   t some data to loose some time), SimpleStm(SELECT * FROM spark_demo.albums LIMIT 100;), Comment(Use @timestamp value set at the beginning(10)), SimpleStm(INSERT INTO spark_demo.ts(key,value) VALUES(1,'val2');), Comment(Check the result), SimpleStm(SELECT * FROM spark_demo.ts WHERE key=1;)) did not match the given pattern (ParagraphParse   rTest.scala:427)

@felixcheung
Copy link
Member

@doanduyhai is cassandra supported on windows?

@doanduyhai
Copy link
Contributor

@felixcheung

Yes, Cassandra is supported on Windows.

The error is about parsing multi line comments. The first test failure looks interesting. Basically it says that the following test fails:

  "Parser" should "parse multi-line comment" in {
    val query:String =
      """/*This is a comment
        |line1
        |line2
        |line3
        |*/
      """.stripMargin

    val parsed = parser.parseAll(parser.multiLineComment, query)
    parsed should matchPattern {
      case parser.Success(Comment("This is a comment\nline1\nline2\nline3\n"), _) =>
    }
  }

because it does not match the regexp pattern: (?s)/\*(.*)\*/ e.g:

  • (?s) = DOT ALL mode
  • \* = start with \*
  • (.*) = any character
  • */ = end with */

I'm wondering if it is not related to the difference between Linux/Unix and Windows for the line terminator

@doanduyhai
Copy link
Contributor

Let me checkout the project on Windows and try (I need to find a Windows :D)

@felixcheung
Copy link
Member

The last CI failure is because of spark release download timeout.
I think we are good to go - how about we track the Windows Cassandra test failure in a new JIRA @prabhjyotsingh @doanduyhai ?

@granturing
Copy link
Contributor Author

Hi Felix, agree with the Cassandra test failure. I have a couple more things I want to verify and then we should be good to go. I'm writing up some docs related to Windows deployment that will be part of this PR.

@Leemoonsoo
Copy link
Member

+1 for track the Windows Cassandra test failure in a new JIRA

@granturing
Copy link
Contributor Author

So, interestingly enough I only see an issue with Spark (using Hadoop 2.6). Running Flink seems fine, but it could be dependent on the Hadoop client versions being used by the two. But yes, the HDFS interpreter will need to have the necessary Windows dependencies. You want me to pull that line out?

@felixcheung
Copy link
Member

Hmm, interesting. Let's keep this here for now then.

@granturing
Copy link
Contributor Author

Couple questions:

  • The Shell interpreter uses bash, which obviously won't work on Windows. I've confirmed the interpreter will work with a couple changes to support the Windows CMD shell. I can either set a value to decide which interpreter to use at runtime, based on the OS, or we make it configurable. Is it worth making it configurable?
  private static final boolean isWindows = System.getProperty("os.name")
          .startsWith("Windows");

  private final String shell = isWindows ? "cmd /c" : "bash -c";
  • Units tests in the zeppelin-interpreter project have a reference to "../bin/interpreter.sh", which of course won't work on Windows. I've verified the unit tests using "../bin/interpreter.cmd". I just didn't like the fact that when I'm in Windows I can't run all the unit tests. I can either have a static variable which defines what script to use in each unit test or I add a helper class to centralize it and all unit tests reference that. What would you prefer?
  private static final String INTERPRETER_SCRIPT =
          System.getProperty("os.name").startsWith("Windows") ?
                  "../bin/interpreter.cmd" :
                  "../bin/interpreter.sh";

@granturing
Copy link
Contributor Author

I've been running this extensively the past few weeks and have nothing left to add at this point, unless there's additional feedback.

@Leemoonsoo
Copy link
Member

@granturing Thanks for great work. LGTM. Can #749 and #769 also be applied?

@dnldxn
Copy link

dnldxn commented Mar 14, 2016

The script seems to have a problem with CLASSPATH's that contain spaces (e.g. C:\Program Files\xyz\bin). I'm not knowledgeable enough with with Windows Batch scripting to determine if this a problem with the startup script or my particular environment.

The script fails in the following section of "zeppelin.cmd":

if "%CLASSPATH%"=="" (
    set CLASSPATH=%ZEPPELIN_CLASSPATH%
) else (
    set CLASSPATH=%CLASSPATH%;%ZEPPELIN_CLASSPATH%
)

The error thrown:
Error: Could not find or load main class Files\xyx\bin\

@granturing
Copy link
Contributor Author

@dnldxn thanks for testing that. I changed a few things to better handle spaces. Unfortunately, using the bundled Spark with a ZEPPELIN_HOME with spaces will not work, due to Spark issues. External Spark works fine though.

@Leemoonsoo
Copy link
Member

Is it okay to be merged?

@granturing
Copy link
Contributor Author

I have no more changes. @dnldxn were you able to validate the latest commit on your setup?

@dnldxn
Copy link

dnldxn commented Mar 18, 2016

@granturing Yep, it works great on my Windows 10 system.

@corneadoug
Copy link
Contributor

@granturing Selenium tests failed, could you close/re-open this PR to launch CI again?

@granturing granturing closed this Mar 22, 2016
@granturing granturing reopened this Mar 22, 2016
@granturing
Copy link
Contributor Author

One version failed because it couldn't download Spark, the other because of connection refused errors running API tests. My last commit was only against the Windows batch files which aren't being tested in the CI build though.

@corneadoug
Copy link
Contributor

Most of the profiles are green, including those with tests, so LGTM

@Leemoonsoo
Copy link
Member

LGTM. Merge if there're no more discussions.

@asfgit asfgit closed this in 2dc464c Mar 24, 2016
onkarshedge pushed a commit to onkarshedge/incubator-zeppelin that referenced this pull request May 11, 2016
…guration

### What is this PR for?
This is to give Windows first-class support for running Zeppelin without the need for Cygwin or other hacks.

### What type of PR is it?
Improvement

### Todos
* [x] - Fix notebook dir path handling which right now assumes URI compatible string (see https://github.com/apache/incubator-zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/repo/VFSNotebookRepo.java#L63)
* [x] - Add documentation for configuring and running on Windows
* [x] - Independent code review of the CMD scripts to ensure they're correct

### Is there a relevant Jira issue?
ZEPPELIN-647

### How should this be tested?
* Pull this PR
* Build
* Override default ZEPPELIN_NOTEBOOK_DIR in zeppelin-env.cmd to be an absolute file URI such as file:///c:/notebook
* Start with bin\zeppelin.cmd
* If using any Hadoop system ensure you have winutils.exe in your HADOOP_HOME\bin, see (https://github.com/steveloughran/winutils)

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes

Author: Silvio Fiorito <[email protected]>
Author: Silvio Fiorito <Silvio Fiorito>

Closes apache#734 from granturing/windows-support and squashes the following commits:

8aadd45 [Silvio Fiorito] Fixes to handle spaces in paths properly, both for ZEPPELIN_HOME and CLASSPATH
73aaf4f [Silvio Fiorito] Default to the appropriate interpreter when running on Windows
db28fe9 [Silvio Fiorito] Support for running unit tests on Windows using the appropriate interpreter script
a1e3097 [Silvio Fiorito] Support for Windows CMD shell interpreter
82acdcf [Silvio Fiorito] Merge branch 'master' into windows-support
9e8b309 [Silvio Fiorito] Initital doc updates for running on Windows
03baf62 [Silvio Fiorito] Additional fix for embedded pyspark environment variables
2b9f01c [Silvio Fiorito] Fix for pyspark PYTHONPATH environment variable not being set properly due to delayed expansion
c700808 [Silvio Fiorito] Check for Windows path before creating URI to prevent URISyntaxExecption
d30e4b9 [Silvio Fiorito] And again fix indentations missed last time
5b49d3e [Silvio Fiorito] Cleaned up indentation
9e40482 [Silvio Fiorito] Initial support for Windows platform, startup scripts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants